A Hybrid Method for Protein Secondary Structure Prediction

Authors

  • Shing-Hwang Doong
  • Chi-Yuan Yeh
Abstract

Protein secondary structure can help determine tertiary structure via fold recognition, and predicting the secondary structure from the protein sequence has attracted the attention of many researchers. The support vector machine (SVM) is a new learning algorithm based on statistical learning theory that has been applied successfully to the protein secondary structure prediction problem. However, training an SVM prediction model on a large data set takes a long time, so it is important to revise the method so that training time is reduced while accuracy is maintained. In this study, we implement a genetic algorithm to cluster the data set before the structure classification is predicted. Using a position-specific scoring matrix as part of the input, the hybrid method achieves good performance in 7-fold cross-validation tests on a set of 513 non-redundant protein sequences (the CB513 data set). The result is comparable to that of the best existing prediction method, yet the time spent is substantially reduced.

Keywords: secondary structure prediction, support vector machine, clustering.
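As an illustration of the input and evaluation setup mentioned in the abstract, the following is a minimal sketch in Python, not the authors' implementation. It assumes that each residue is encoded by flattening the position-specific scoring matrix rows in a window centred on it (a common practice that the abstract does not confirm), and that the 7-fold test partitions whole proteins. The function names, the window half-width, and the round-robin fold assignment are all hypothetical. The genetic-algorithm clustering and the SVM training are not shown.

```python
from typing import List, Sequence


def window_features(pssm: Sequence[Sequence[float]],
                    half_width: int = 2) -> List[List[float]]:
    """Return one feature vector per residue.

    Each vector concatenates the PSSM rows in a window of
    2 * half_width + 1 residues centred on the target residue.
    Positions past either end of the sequence are zero-padded.
    The window size is an assumption, not taken from the paper.
    """
    if not pssm:
        return []
    n_cols = len(pssm[0])
    pad = [0.0] * n_cols
    features = []
    for i in range(len(pssm)):
        row: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            row.extend(pssm[j] if 0 <= j < len(pssm) else pad)
        features.append(row)
    return features


def k_fold_indices(n_items: int, k: int = 7) -> List[List[int]]:
    """Partition range(n_items) into k nearly equal folds.

    Items are assigned round-robin; a real experiment would
    more likely shuffle or balance the folds first.
    """
    return [list(range(fold, n_items, k)) for fold in range(k)]


if __name__ == "__main__":
    # Toy 3-residue "PSSM" with 2 columns (a real PSSM has 20).
    toy_pssm = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(window_features(toy_pssm, half_width=1))

    # 513 proteins split into 7 folds, as in the CB513 test.
    folds = k_fold_indices(513, k=7)
    print([len(fold) for fold in folds])
```

In each round of cross-validation, one fold would be held out for testing and the remaining six used for clustering and training.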


Related articles

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, which contains all genetic traits, is not a functional entity. Instead, it is converted into protein sequences by the transcription and translation processes. The protein sequence later takes on a 3D structure, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can think of is undertaken by proteins with specific functio...


Prediction of Secondary Structure of Citrus Viroids Reported from Southern Iran

Viroids are the smallest single-stranded, circular, highly structured plant pathogenic RNAs, and they do not code for any protein. Viroids belong to two families, the Avsunviroidae and the Pospiviroidae. Members of the Pospiviroidae family adopt a rod-like secondary structure. In this study the most stable secondary structures of citrus viroid variants that were reported from Fars province wer...


Improved Protein Secondary Structure Prediction using a Intelligent HSVM Method with a New Encoding Scheme

Prediction of protein secondary structures is an important problem in bioinformatics and has many applications. Successful secondary structure predictions provide a starting point for direct tertiary structure modelling, and also can significantly improve sequence analysis and sequence-structure threading for aiding in structure and function determination. Now many secondary structure predictio...


A knowledge-based hybrid method for protein secondary structure prediction based on local prediction confidence

Motivation: In our previous approach, we proposed a hybrid method for protein secondary structure prediction, called HYPROSP, which combined our proposed knowledge-based prediction algorithm PROSP and PSIPRED. The knowledge base constructed for PROSP contains small peptides together with their secondary structural information. The hybrid strategy of HYPROSP uses a global quantitative measure, m...


HYPROSP: a hybrid protein secondary structure prediction algorithm--a knowledge-based approach.

We develop a knowledge-based approach (called PROSP) for protein secondary structure prediction. The knowledge base contains small peptide fragments together with their secondary structural information. A quantitative measure M, called match rate, is defined to measure the amount of structural information that a target protein can extract from the knowledge base. Our experimental results show t...



Publication date: 2004